Abstract

Given two testable properties $\mathcal{P}_1$ and $\mathcal{P}_2$, under what conditions are the union, intersection or
set-difference of these two properties also testable? We initiate a systematic study of these basic set-theoretic
operations in the context of property testing. As an application, we give a conceptually different proof
that linearity is testable, albeit with much worse query complexity. Furthermore, for the problem of
testing disjunction of linear functions, which was previously known to be one-sided testable with a
super-polynomial query complexity, we give an improved analysis and show it has query complexity
$O(1/\epsilon^2)$, where $\epsilon$ is the distance parameter.
1 Introduction

During the last two decades, the size of data sets has been increasing at an exponential rate, rendering a 
linear scan of the whole input an unaffordable luxury. Thus, we need sublinear time algorithms that read 
a vanishingly small fraction of their input and still output something intelligent and non-trivial about the 
properties of the input. The model of property testing [33, 22] has been very useful in understanding the
power of sublinear time. Property testing is concerned with the existence of a sublinear time algorithm that 
queries an input object a small number of times and decides correctly with high probability whether the 
object has a given property or whether it is "far away" from having the property. 

We model input objects as strings of arbitrary length, which can also be viewed as functions on
arbitrarily large domains. Formally, let $\mathcal{R}$ be a finite set and $\mathcal{D} = \{D_n\}_{n>0}$ be a parametrized family of domains.
Let $\mathcal{R}^{\mathcal{D}}$ denote the set of all functions mapping from $\mathcal{D}$ to $\mathcal{R}$. A property $\mathcal{P}$ is simply specified by a family of
functions $\mathcal{P} \subseteq \mathcal{R}^{\mathcal{D}}$. A tester for property $\mathcal{P}$ is a randomized algorithm which, given oracle access to an
input function $f \in \mathcal{R}^{\mathcal{D}}$ together with a distance parameter $\epsilon$, distinguishes with high probability (say, 2/3)
between the case that $f$ satisfies $\mathcal{P}$ and the case that $f$ is $\epsilon$-far from satisfying $\mathcal{P}$. Here, the distance between
functions $f, g : \mathcal{D} \to \mathcal{R}$, denoted $\mathrm{dist}(f, g)$, is simply the probability $\Pr_{x \in \mathcal{D}}[f(x) \neq g(x)]$, where $x$
is chosen uniformly at random from $\mathcal{D}$, and $\mathrm{dist}(f, \mathcal{P}) = \min_{g \in \mathcal{P}}\{\mathrm{dist}(f, g)\}$. We say $f$ is $\epsilon$-far from $\mathcal{P}$ if
$\mathrm{dist}(f, \mathcal{P}) > \epsilon$ and $\epsilon$-close otherwise. The central parameter associated with a tester is the number of oracle
queries it makes to the function $f$ being tested.
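
To make the distance notions concrete, the following minimal sketch computes $\mathrm{dist}(f, g)$ and $\mathrm{dist}(f, \mathcal{P})$ by brute force. It assumes functions are stored as value tables over a small finite domain and that the property is given as an explicit list of such tables; the names `dist`, `dist_to_property` and `is_eps_far` are illustrative only and do not come from the paper. A real tester cannot afford this, since it reads the entire input.

```python
def dist(f, g):
    """Fraction of domain points on which the value tables f and g differ."""
    assert len(f) == len(g)
    return sum(a != b for a, b in zip(f, g)) / len(f)


def dist_to_property(f, prop):
    """Minimum distance from f to any function in the (explicitly listed) property."""
    return min(dist(f, g) for g in prop)


def is_eps_far(f, prop, eps):
    """f is eps-far from the property iff its distance is strictly larger than eps."""
    return dist_to_property(f, prop) > eps


# Toy example: a domain of size 4, range {0, 1}, and the property
# consisting of the two constant functions.
constants = [(0, 0, 0, 0), (1, 1, 1, 1)]
f = (0, 1, 0, 0)
```

Here `f` differs from the all-zero table on one point out of four, so it is at distance 1/4 from the property: it is 0.2-far but 0.25-close.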

Property testing was first studied by Blum, Luby and Rubinfeld [18] and was formally defined by
Rubinfeld and Sudan [33]. The systematic exploration of property testing for combinatorial properties was
initiated by Goldreich, Goldwasser, and Ron [22]. Subsequently, a rich collection of properties have been
shown to be testable E El 13 EU EH S S ESI EH • 

Perhaps the most fundamental question in property testing is the following: which properties have local 
testing algorithms whose running time depends only on the distance parameter $\epsilon$? Are there any attributes
that make a property locally testable? Questions of this type in the context of graph property testing were 
first raised in [22] and later received a lot of attention. Some very general results have been obtained EUSJCl 
|2T]|3l[T9l, leading to an (almost) complete qualitative understanding of which graph properties are efficiently 
testable in the dense graph model (see lPT4l for some recent progress in the sparse graph model). In addition, 
for an important class of properties, namely $H$-freeness for fixed subgraphs $H$, it is known exactly for which
$H$ testing $H$-freeness requires the query complexity to be super-polynomial in $1/\epsilon$ and for which only a
polynomial number of queries suffice: This was shown by Alon [T| for one-sided error testers and by Alon 
and Shapira (H for general two-sided error testers. Progress toward similar understanding has also been 
made for hypergraph properties 11321 191171. 

However, much less is known for algebraic properties. In a systematic study, Kaufman and Sudan [27]
examined the query complexity of a broad class of algebraic properties based on the invariance of these
properties under linear transformations. Roughly speaking, they showed that any locally-characterized
linear-invariant and linear¹ properties are testable with query complexities polynomial in $1/\epsilon$. Non-linear
linear-invariant properties were first shown to be testable by Green [24] and were formally studied in [15]. The
properties studied in [24, 15] are "pattern-freeness" of Boolean functions, which has been attracting
considerable attention [24, 15, 34, 29, 16], as such a study may lead to a complete characterization of testability
for functions, analogous to the setting of graphs.

1 A property $\mathcal{P}$ is linear if, for any $f$ and $g$ in $\mathcal{P}$, the sum $f + g$ is also in $\mathcal{P}$.
1.1 Motivation for set-theoretic operations

In this paper we propose a new paradigm to systematically study algebraic property testing. First,
decompose a natural algebraic property into the union or intersection (or some other set operation) of a set of
"atomic properties". Second, try to show that each of these atomic properties is testable. Finally, prove that
some "composite" property obtained from applying some set-theoretic operations on the (testable) atomic
properties is also testable. A prominent example is the set of low-degree polynomials [4, 26, 23]. It is easy to
see that the property of being a degree-$d$ polynomial over GF(2) is simply the intersection of $2^{2^{d+1}-2}$ atomic
properties. Indeed, let $\mathcal{P}_d$ denote the set of $n$-variate polynomials of degree at most $d$. Then, by the
characterization of low-degree polynomials (see, e.g., El), $f \in \mathcal{P}_d$ if and only if for every $x_1, \ldots, x_{d+1} \in \mathbb{F}_2^n$,

$$\sum_{\emptyset \neq S \subseteq [d+1]} f\Big(\sum_{i \in S} x_i\Big) \equiv 0 \pmod 2.$$

Now fix an ordering of the non-trivial subsets of $[d+1] = \{1, 2, \ldots, d+1\}$. Let $b$ be a bit-string
of length $2^{d+1}-1$ with an odd number of ones and $\mathcal{P}_{d,b}$ denote the set of functions $f$ such that the string
$\big(f(\sum_{i \in S} x_i)\big)_{\emptyset \neq S \subseteq [d+1]}$ is not equal to $b$. By definition, $\mathcal{P}_d$ is the intersection of $2^{2^{d+1}-2}$ "$b$-free" properties².
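
The subset-sum identity and the count of forbidden patterns can be checked by brute force for tiny parameters. The sketch below is only an illustration under stated assumptions: points of $\mathbb{F}_2^n$ are encoded as integers with XOR as addition, and the helper names (`subset_sums`, `satisfies_identity`, `odd_weight_patterns`) are hypothetical. Note that setting all $x_i = 0$ in the identity forces $f(0) = 0$, so the examples use polynomials without a constant term.

```python
from itertools import combinations, product


def subset_sums(points):
    """XOR (addition in F_2^n) of the points in every non-empty subset, in a fixed order."""
    sums = []
    for size in range(1, len(points) + 1):
        for subset in combinations(points, size):
            total = 0
            for p in subset:
                total ^= p
            sums.append(total)
    return sums


def satisfies_identity(f, n, d):
    """Check that the values of f on all non-empty subset sums add up to 0 mod 2,
    for every choice of d + 1 points in F_2^n."""
    for points in product(range(2 ** n), repeat=d + 1):
        if sum(f(s) for s in subset_sums(points)) % 2 != 0:
            return False
    return True


def odd_weight_patterns(d):
    """All bit-strings b of length 2^(d+1) - 1 with an odd number of ones."""
    length = 2 ** (d + 1) - 1
    return [b for b in product((0, 1), repeat=length) if sum(b) % 2 == 1]


def bit(x, i):
    return (x >> i) & 1


def quadratic(x):
    # x_0 * x_1 + x_2: degree 2, no constant term
    return (bit(x, 0) & bit(x, 1)) ^ bit(x, 2)


def cubic(x):
    # x_0 * x_1 * x_2: degree 3
    return bit(x, 0) & bit(x, 1) & bit(x, 2)
```

With $n = 3$, `quadratic` satisfies the identity for $d = 2$ but not for $d = 1$, and `cubic` violates it for $d = 2$. The number of odd-weight patterns for $d = 1$ is $4 = 2^{2^{2}-2}$, matching the count of atomic properties stated above.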

In order to carry out this program of decomposing an algebraic property into atomic ones, one must
have a solid understanding of how basic set-theoretic operations affect testability. For instance, given two
testable properties, is the union, intersection, or set-difference also testable? Previously, Goldreich,
Goldwasser and Ron considered such questions in their seminal paper [22]. They observed that the union of two
testable properties is always testable (cf. Section 3.1), but also provided examples showing that in general,
testability is not closed under other set-theoretic operations. Thus, current understanding of testability via
set-theoretic operations seems insufficient to carry out the above-mentioned program of attack.

1.2 Our results 

In this paper, we show more positive results for these basic set-theoretic operations and illustrate several 
applications. We now describe our contribution in more detail. 

Set-theoretic operations We provide sufficient conditions that allow local testability to be closed under 
intersection and set difference. Given two locally testable properties, we show that if the two properties 
(minus their intersection) are sufficiently far apart, then their intersection is also locally testable. For set 
difference, a similar statement can also be made, albeit with more technicality, requiring that one of the 
properties must be "tolerantly testable". 

A more detailed treatment of these set operations appears in Section 3. We remark that in the general
case, testability is not closed under most set operations. Thus, putting restrictions on these properties is not 
unwarranted. 

Applications of these set-theoretic considerations appear in Sections 4.2 and 4.3. Furthermore,
Section 4.3 demonstrates the simplicity that comes from these set-theoretic arguments. There, via set theory,
we define a new property from an established one, and show that the new property's testability, in terms of
both upper and lower bounds, is inherited from the previous property.

2 In fact, some of these $2^{2^{d+1}-2}$ properties are identical since the set of non-trivial subsets generated by the $x_i$'s is
invariant under permutation of the $x_i$'s.
Disjunction of linear functions In addition to set theory, it is also natural to ask whether testability is
preserved under the closure of some fundamental, unary operations. For instance, given a testable property
$\mathcal{P}$, under what condition is its additive closure $\oplus\mathcal{P}$ testable? A similar question can also be asked for the
disjunctive operator $\vee$, which is one of the most basic operations used to combine formulas. Given a testable
property $\mathcal{P}$, is its disjunctive closure $\vee\mathcal{P}$ testable?

Trivially, if $\mathcal{P}$ is linear, then $\oplus\mathcal{P} = \mathcal{P}$ and testability is preserved. Furthermore, if $\mathcal{P}_1$ and $\mathcal{P}_2$ are both
linear and linear-invariant as introduced by Kaufman and Sudan [27], then their sumset $\mathcal{P}_1 + \mathcal{P}_2$ is testable.
However, in general, not much can be said about how these basic operations affect testability. 

Here we focus on disjunction's effect on one specific property, namely the set of linear functions. Before 
we describe our result, we note some previous works in testing where disjunction played a role. For the 
disjunction of monomials, Parnas et al. (32 gave a testing algorithm for $s$-term monotone DNF with query
complexity $O(s^2/\epsilon)$. Diakonikolas et al. GUI generalized Parnas et al.'s result to general $s$-term DNF with
query complexity $O(s^4/\epsilon^2)$.

We take a different direction and ask how disjunction affects the testability of the set of linear functions.
The property of being a linear Boolean function (see next section for a full discussion), first studied by
Blum, Luby and Rubinfeld [18], is testable with query complexity $O(1/\epsilon)$. As observed in [15], the class of
disjunctions of linear functions is equal to the class of 100-free functions (see Preliminaries for a definition).
There they showed that a sufficiently rich class of "pattern-free" functions is testable, albeit with query
complexity a tower of 2's whose height is a function of $1/\epsilon$. In a different context, the authors of [23]
showed implicitly³ that the disjunction of linear functions is testable with query complexity polynomial in
$1/\epsilon$, but with two-sided error.

Since both [15] and [23] seek to describe rich classes of testable Boolean functions, the bounds from both
works do not adequately address how disjunction affects the query complexity of the underlying property,
the set of linear functions. In Section 4.1, we give a direct proof, showing that the disjunction of linear
functions is testable with query complexity $O(1/\epsilon^2)$ and has one-sided error. Thus, the blowup from the
disjunctive operator is $O(1/\epsilon)$. It will be interesting to see if the blowup is optimal for this problem.

A different proof for linearity testing Linearity testing, first proposed by Blum, Luby and Rubinfeld [18],
is arguably the most fundamental and extensively studied problem in property testing of Boolean functions. 
Due to its simplicity and important applications in PCP constructions, much effort has been devoted to the 
study of the testability of linearity ifTHl fT2l rTTl flOl 1281 . 

For linearity, we indeed are able to carry out the program of decomposing an algebraic property into
atomic pattern-free properties, and thus obtain a new proof that linearity is testable in Section 4.2. In
particular, linearity is easily seen to be equal to the intersection of two atomic properties, namely
triangle-freeness (see Section 2) and disjunction of linear functions, which are both testable.

The query complexity of linearity in our proof is of the tower-type, drastically worse than the optimal 
$O(1/\epsilon)$ bound, where $\epsilon$ is the distance parameter. We note that our effort in obtaining a new proof lies not in
improving the parameters, but in understanding the relationships among these atomic, testable properties. In 
fact, we believe that despite the poor upper bound, our new proof is conceptually simple and gives evidence 
that set theory may uncover new testable properties. 

3 We thank an anonymous reviewer from ICS 2011 for pointing this out.
1.3 Techniques

Our new proof that linearity is testable is built on the testability results for triangle-freeness (see definition
in Section 2) and the disjunction of linear functions. The latter was already shown to be testable in [15].
However, in this work, we give a completely different proof using a BLR-style approach. Our proof is
a novel variant of the classical self-correction method. Consequently, the query upper bound we obtain
(quadratic in $1/\epsilon$) is significantly better than the tower-type upper bound shown in [15]. In fact, to the
best of our knowledge, this is the first and only polynomial query upper bound for testing pattern-freeness
properties. All other analyses for testing pattern-freeness properties apply some type of "regularity lemma",
thus making tower-type query upper bounds unavoidable.

We believe that both the self-correction technique and the investigation of set operations may be useful
in the study of testing pattern-freeness. From the works developed in [34, 29], we know that for every $d$,
the property $\mathcal{P}_{d,\vec{1}}$ is testable⁴. However, for an arbitrary $b$, the testability of $\mathcal{P}_{d,b}$ remains open. And in
general very little can be said about the testability of an arbitrary intersection of these properties. Since $\mathcal{P}_d$
is known to be testable using self-correction [4], we believe that self-correction, applied in conjunction with
set theory, may be useful for understanding these pattern-free properties.

2 Preliminaries 

We now describe some basic notation and definitions that we use throughout the paper. We let $\mathbb{N} =
\{0, 1, \ldots\}$ denote the set of natural numbers and $[n]$ the set $\{1, \ldots, n\}$. We view elements in $\mathbb{F}_2^n$ as
$n$-bit binary strings, that is, elements of $\{0, 1\}^n$. For $x \in \mathbb{F}_2^n$, we write $x_i \in \{0, 1\}$ for the $i$-th bit of $x$. If $x$ and
$y$ are two $n$-bit strings, then $x + y$ denotes bitwise addition (i.e., XOR) of $x$ and $y$, and $x \cdot y = \sum_{i=1}^{n} x_i y_i
\pmod 2$ denotes the inner product between $x$ and $y$. We write $(x, y)$ to denote the concatenation of two bit
strings $x$ and $y$. For convenience, sometimes we view an $n$-bit binary string as a subset of $[n]$, that is, for every
$x \in \mathbb{F}_2^n$ there is a corresponding subset $S_x \subseteq [n]$ such that $x_i = 1$ iff $i \in S_x$ for every $1 \leq i \leq n$. We write
$|x|$ to indicate the Hamming weight of $x$, i.e., the number of coordinates $i$ such that $x_i = 1$. Equivalently,
this is also the cardinality of the subset $S_x$. By abuse of notation, we use parentheses to denote multisets; for
instance, we write $(a, a, b, b, b)$ for the multiset which consists of two $a$'s and three $b$'s.

Let $f : \mathbb{F}_2^n \to \{0, 1\}$ be a Boolean function. The support of $f$ is $\mathrm{supp}(f) = \{x \in \mathbb{F}_2^n : f(x) = 1\}$.
Recall that for two functions $f$ and $g$ defined over the same domain, the (fractional) distance between
these two functions is $\mathrm{dist}(f, g) \stackrel{\mathrm{def}}{=} \Pr_{x \in \mathcal{D}}[f(x) \neq g(x)]$. Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two properties defined over
the same domain $\mathcal{D}$; then the distance between these two properties, $\mathrm{dist}(\mathcal{P}_1, \mathcal{P}_2)$, is simply defined to be
$\min_{f \in \mathcal{P}_1, g \in \mathcal{P}_2}\{\mathrm{dist}(f, g)\}$.

A Boolean function $f : \mathbb{F}_2^n \to \{0, 1\}$ is linear if for all $x$ and $y$ in $\mathbb{F}_2^n$, $f(x) + f(y) = f(x + y)$. We denote
the set of linear functions by $\mathcal{P}_{\mathrm{LIN}}$. Throughout this paper, we will be working with the pattern generated by
the triple $(x, y, x + y)$. To this end, we say that a Boolean function $f : \mathbb{F}_2^n \to \{0, 1\}$ is $(1, 0, 0)$-free if for all $x$
and $y$ in $\mathbb{F}_2^n$, $(f(x), f(y), f(x + y)) \neq (1, 0, 0)$, where here and after we view $(f(x), f(y), f(x + y))$ as well
as $(1, 0, 0)$ as multisets⁵. We denote the set of $(1, 0, 0)$-free functions by $\mathcal{P}_{(100)\text{-FREE}}$. A $(1, 1, 0)$-free
Boolean function is defined analogously. Lastly, we say that a Boolean function $f : \mathbb{F}_2^n \to \{0, 1\}$ is
triangle-free if for all $x$ and $y$ in $\mathbb{F}_2^n$, $(f(x), f(y), f(x + y)) \neq (1, 1, 1)$. We denote the set of
triangle-free functions by $\mathcal{P}_{(111)\text{-FREE}}$. Note that $\mathcal{P}_{(111)\text{-FREE}}$ is monotone: if $f \in \mathcal{P}_{(111)\text{-FREE}}$ and we modify

4 Actually, stronger theorems were proved in [34, 29], but to state their works in full, definitions not needed in this work would
have to be introduced.

5 That is, for example, we do not distinguish the case $(f(x), f(y), f(x+y)) = (1, 0, 0)$ from $(f(x), f(y), f(x+y)) = (0, 1, 0)$.
$f$ by changing its value at some of the points in $\mathbb{F}_2^n$ from 1 to 0, then the new function is clearly also triangle-free. We
encapsulate this observation into the following statement:

Observation 1. Let $f$ and $g$ be two Boolean functions such that $\mathrm{supp}(f) \subseteq \mathrm{supp}(g)$. Then

$$\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) \leq \mathrm{dist}(g, \mathcal{P}_{(111)\text{-FREE}}).$$

For concreteness, we provide a formal definition of a tester. 

Definition 1 (Testability). Let $\mathcal{R}$ be a finite set and $\mathcal{D} = \{D_n\}_{n>0}$ be a parametrized family of domains.
Let $\mathcal{P} \subseteq \mathcal{R}^{\mathcal{D}}$ be a property. We say a (randomized) algorithm $T$ is a tester for $\mathcal{P}$ with query complexity
$q(\epsilon, n)$ if for any distance parameter $\epsilon > 0$, input size $n$ and function $f : D_n \to \mathcal{R}$, $T$ satisfies the following:

• $T$ queries $f$ at most $q(\epsilon, n)$ times;

• (completeness) if $f \in \mathcal{P}$, then $\Pr[T \text{ accepts}] = 1$;

• (soundness) if $\mathrm{dist}(f, \mathcal{P}) > \epsilon$, then $\Pr[T \text{ accepts}] \leq 1/3$, where the probabilities are taken over the
internal randomness used by $T$.

We say that a property is locally testable if it has a tester whose query complexity is a function depending 
only on $\epsilon$, independent of $n$. In this work, we actually use the word testability to describe the stronger notion
of local testability. For our main results, we will work with the model case when $D_n = \mathbb{F}_2^n$ and $\mathcal{R} = \{0, 1\}$.

3 Basic theory of set operations 

In this section, we present some basic testability results based on set-theoretic operations such as union, 
intersection, complementation, and set-difference. The proofs here are fairly standard and are thus deferred 
to the Appendix. 

3.1 Union 

It is well known that the union of two testable properties remains testable. This folklore result first appeared
in [22]; for completeness, a proof is included in Appendix A.

Proposition 1 (Folklore). Let $\mathcal{P}_1, \mathcal{P}_2 \subseteq \mathcal{R}^{\mathcal{D}}$ be two properties defined over the same domain $\mathcal{D} = \{D_n\}_{n>0}$.
For $i = 1, 2$, suppose $\mathcal{P}_i$ is testable with query complexity $q_i(\epsilon)$. Then the union $\mathcal{P}_1 \cup \mathcal{P}_2$ is testable with
query complexity $O(q_1(\epsilon) + q_2(\epsilon))$.

3.2 Intersection 

The case of set intersection is more complicated than union. Goldreich et al. showed in [22] (see Proposition
4.2.2) that there exist testable properties whose intersection is not testable. Thus, in general, testability is
not preserved under intersection. However, testability may still follow in restricted cases. In
particular, we show that if two testable properties $\mathcal{P}_1$ and $\mathcal{P}_2$ minus their intersection are sufficiently far
from each other, then their intersection remains testable as well. A proof is included in Appendix B.

Proposition 2. Let $\mathcal{P}_1, \mathcal{P}_2 \subseteq \mathcal{R}^{\mathcal{D}}$ be two properties defined over the same domain $\mathcal{D} = \{D_n\}_{n>0}$. Suppose
$\mathrm{dist}(\mathcal{P}_1 \setminus \mathcal{P}_2, \mathcal{P}_2 \setminus \mathcal{P}_1) \geq \epsilon_0$ for some absolute constant $\epsilon_0$, and for $i = 1, 2$, $\mathcal{P}_i$ is testable with query
complexity $q_i(\epsilon)$. Then the intersection $\mathcal{P}_1 \cap \mathcal{P}_2$ is testable with query complexity $O(q_1(\epsilon) + q_2(\epsilon))$.
3.3 Complementation

Here we examine the effect complementation has on the testability of a property. As it turns out, all three
outcomes (both $\mathcal{P}$ and $\overline{\mathcal{P}}$ are testable, only one of $\mathcal{P}$ and $\overline{\mathcal{P}}$ is testable, and neither $\mathcal{P}$ nor $\overline{\mathcal{P}}$ is testable) are
possible!

The first outcome is the easiest to observe. Note that the property $\mathcal{R}^{\mathcal{D}}$ (containing all functions) and the empty property are
complements of each other, and both are trivially testable. The second outcome is observed in Proposition
4.2.3 in [22]. To our knowledge, the third outcome has not been considered before. In fact, previous
constructions of non-testable properties, e.g. [22, 13], are sparse. Hence, the complements of these
non-testable properties are trivially testable (by the tester that accepts all input functions). One may wonder if in
general the complement of a non-testable property must also be testable. We disprove this in the following
proposition.

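Proposition 3. There exists some property $\mathcal{P} \subseteq \mathcal{R}^{\mathcal{D}}$, where $\mathcal{R} = \{0, 1\}$ and $\mathcal{D} = \{\mathbb{F}_2^n\}_{n>0}$, such that neither
$\mathcal{P}$ nor $\overline{\mathcal{P}}$ is testable for any $\epsilon < 1/8$.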

By utilizing coding theory, we can bypass the sparsity condition to prove Proposition 3. Essentially,
property $\mathcal{P}$ consists of neighborhoods around functions that have degree $n/2 - 1$ as polynomials over $\mathbb{F}_{2^n}$.
Its complement contains functions that are polynomials of degree $n/2$. Since $d$ evaluations are needed to
specify a polynomial of degree $d$, any tester for $\mathcal{P}$ or $\overline{\mathcal{P}}$ needs (roughly) at least $n/2$ queries. Using a
standard argument involving code concatenation, one can construct $\mathcal{P}$ and $\overline{\mathcal{P}}$ to be binary properties that
require testers of query complexity $\Omega(2^{n/2})$. A formal proof can be found in Appendix C.

3.4 Difference 

Let $\mathcal{P}_1$ and $\mathcal{P}_2$ be two properties and let $\mathcal{P} = \mathcal{P}_1 \setminus \mathcal{P}_2$ denote the set difference of the two properties. In this
section, we confine our attention to the simple case that $\mathcal{P}_2 \subseteq \mathcal{P}_1$. Since complementation is a special case
of set-difference, from Section 3.3 we know that in general we can infer nothing about the testability of $\mathcal{P}$
from the fact that both $\mathcal{P}_1$ and $\mathcal{P}_2$ are testable. However, under certain restrictions, we still can show that $\mathcal{P}$
is testable.

First we observe a simple case in which $\mathcal{P}_1 \setminus \mathcal{P}_2$ is testable. This simple observation, whose proof is
immediate and therefore omitted, is utilized in the proof of Theorem 4 in Section 4.3.

Observation 2. Let $\mathcal{P}_2 \subseteq \mathcal{P}_1$ be two testable properties defined over the same domain $\mathcal{D} = \{D_n\}_{n>0}$. If
for every $f \in \mathcal{P}_2$, there is some $g \in \mathcal{P}_1 \setminus \mathcal{P}_2$ such that $\mathrm{dist}(f, g) = o(1)$, then $\mathcal{P}_1 \setminus \mathcal{P}_2$ is testable by the same
tester which tests property $\mathcal{P}_1$.

Our second observation on set difference relies on the notion of tolerant testing, introduced by Parnas, 
Ron, and Rubinfeld QUI to investigate testers that are guaranteed to accept (with high confidence) not only 
inputs that satisfy the property, but also inputs that are sufficiently close to satisfying it. 

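Definition 2 (Tolerant Tester QUI ). Let $0 < \epsilon_1 < \epsilon_2 < 1$ denote two distance parameters and $\mathcal{P} \subseteq \mathcal{R}^{\mathcal{D}}$ be
a property defined over the domain $\mathcal{D} = \{D_n\}_{n>0}$. We say that property $\mathcal{P}$ is $(\epsilon_1, \epsilon_2)$-tolerantly testable
with query complexity $q(\epsilon_1, \epsilon_2)$ if there is a tester $T$ that makes at most $q(\epsilon_1, \epsilon_2)$ queries such that for all $f$ with
$\mathrm{dist}(f, \mathcal{P}) \leq \epsilon_1$, $T$ rejects $f$ with probability at most 1/3, and for all $f$ with $\mathrm{dist}(f, \mathcal{P}) > \epsilon_2$, $T$ accepts $f$
with probability at most 1/3.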

We record in the following proposition that if $\mathcal{P}$ and $\mathcal{P}_2$ are sufficiently far apart and $\mathcal{P}_2$ is tolerantly
testable, then $\mathcal{P}$ is also testable. We include a proof in Appendix D.

Proposition 4. Let $\epsilon_1 < \epsilon_2 < \epsilon_0$ be three absolute constants. Let $\mathcal{P}_2 \subseteq \mathcal{P}_1 \subseteq \mathcal{R}^{\mathcal{D}}$ be two properties defined
over the same domain $\mathcal{D} = \{D_n\}_{n>0}$. If for every $\epsilon > 0$, $\mathcal{P}_1$ is testable with query complexity $q_1(\epsilon)$, $\mathcal{P}_2$
is $(\epsilon_1, \epsilon_2)$-tolerantly testable with query complexity $q_2(\epsilon_1, \epsilon_2)$, and $\mathrm{dist}(\mathcal{P}_1 \setminus \mathcal{P}_2, \mathcal{P}_2) \geq \epsilon_0$, then $\mathcal{P}_1 \setminus \mathcal{P}_2$ is
testable with query complexity $O(q_1(\epsilon) + q_2(\epsilon_1, \epsilon_2))$ (and completeness 2/3).

We note that since $\mathcal{P}_2$ is tolerantly testable, it does not have completeness 1. Thus, the set difference
$\mathcal{P}_1 \setminus \mathcal{P}_2$ is not guaranteed to have one-sided error, either.

4 Main results 

In this section we show two applications of the results developed in Section 3. We stress that set-theoretic
arguments may be used to show both upper bound results (some properties are testable with only a small
number of queries) and lower bound results (some properties cannot be tested by any tester with fewer than
a certain number of queries).

4.1 Testing disjunction of linear functions 

In this section, we employ a BLR-style analysis to show that the class of disjunctions of linear functions is
testable with query complexity $O(1/\epsilon^2)$. We first recall from [15] that a function is a disjunction of linear
functions iff it is $(1, 0, 0)$-free. (Recall that $\mathcal{P}_{(100)\text{-FREE}}$ is the set of Boolean functions that are free of
$(1, 0, 0)$-patterns for any $x$, $y$ and $x + y$ in $\mathbb{F}_2^n$.)

Proposition 5 ([15]). A function $f : \mathbb{F}_2^n \to \{0, 1\}$ is $(1, 0, 0)$-free if and only if $f$ is the disjunction (OR) of
linear functions (or the all 1 function).

Proof. The reverse direction is obvious. For the forward direction, let $S = \{x \in \mathbb{F}_2^n : f(x) = 0\}$. If $S$ is
empty, then $f$ is the all 1 function. Otherwise let $x$ and $y$ be any two elements in $S$ (not necessarily distinct).
Then if $f$ is $(1, 0, 0)$-free, it must be the case that $x + y$ is also in $S$. Thus $S$ is a linear subspace of $\mathbb{F}_2^n$.
Suppose the dimension of $S$ is $n - k$ with $k \geq 1$ (if $k = 0$, then $f$ is the all-zero function, which is itself
linear). Then there are $k$ linearly independent vectors $a_1, \ldots, a_k \in \mathbb{F}_2^n$
such that $z \in S$ iff $\{z \cdot a_1 = 0\} \wedge \cdots \wedge \{z \cdot a_k = 0\}$. Therefore, by De Morgan's law, $f(z) = 1$ iff $z \notin S$
iff $\{z \cdot a_1 = 1\} \vee \cdots \vee \{z \cdot a_k = 1\}$, which is equivalent to the claim. □
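
Proposition 5 can be sanity-checked exhaustively for a small number of variables. The sketch below is only a brute-force illustration for $n = 3$, with points of $\mathbb{F}_2^n$ encoded as integers and functions stored as value tables; the helper names are hypothetical.

```python
from itertools import product


def dot(a, x):
    """Inner product mod 2 of two bit-vectors encoded as integers."""
    return bin(a & x).count("1") % 2


def is_100_free(f, n):
    """True iff no triple (f(x), f(y), f(x+y)) equals (1, 0, 0) as a multiset."""
    size = 2 ** n
    return all(
        sorted((f[x], f[y], f[x ^ y])) != [0, 0, 1]
        for x in range(size)
        for y in range(size)
    )


def disjunctions_of_linear(n):
    """Value tables of all ORs of non-empty sets of linear functions on F_2^n."""
    size = 2 ** n
    linear = [tuple(dot(a, x) for x in range(size)) for a in range(size)]
    tables = set()
    for mask in range(1, 2 ** size):  # every non-empty subset of the linear functions
        chosen = [linear[a] for a in range(size) if (mask >> a) & 1]
        tables.add(tuple(max(values) for values in zip(*chosen)))
    return tables


n = 3
disjunctions = disjunctions_of_linear(n)
free_functions = [f for f in product((0, 1), repeat=2 ** n) if is_100_free(f, n)]
```

For $n = 3$ there are 17 such functions: one for each of the 16 subspaces of $\mathbb{F}_2^3$ (as the zero set) plus the all 1 function.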

$\mathcal{P}_{(100)\text{-FREE}}$ was shown to be testable with a tower-type query upper bound in [15]. We now give a
direct proof that $\mathcal{P}_{(100)\text{-FREE}}$ is testable with a quadratic upper bound. In fact, by symmetry the testability
of $\mathcal{P}_{(110)\text{-FREE}}$ is the same as the testability of $\mathcal{P}_{(100)\text{-FREE}}$.

Theorem 1. For every distance parameter $\epsilon > 0$, the property $\mathcal{P}_{(100)\text{-FREE}}$ is testable with query complexity
$O(1/\epsilon^2)$.

Proof. Suppose we have oracle access to some Boolean function $f : \mathbb{F}_2^n \to \{0, 1\}$. A natural 3-query test
$T$ for $\mathcal{P}_{(100)\text{-FREE}}$ proceeds as follows. $T$ picks $x$ and $y$ independently and uniformly at random from $\mathbb{F}_2^n$,
and accepts iff $(f(x), f(y), f(x + y)) \neq (1, 0, 0)$.

Let $R = \Pr_{x,y}[(f(x), f(y), f(x + y)) = (1, 0, 0)]$ be the rejection probability of $T$. If $f \in \mathcal{P}_{(100)\text{-FREE}}$,
then $R = 0$, i.e., $T$ has completeness 1. For soundness, in a series of steps, we shall show that for every $\epsilon >
0$, if $R < \epsilon^2/128$, then there exists a Boolean function $g$ such that (1) $g$ is well-defined, (2) $\mathrm{dist}(f, g) < \epsilon$,
and (3) $g$ is in $\mathcal{P}_{(100)\text{-FREE}}$.

Let $\mu_0$ denote $\Pr_x[f(x) = 0]$. Suppose $\mu_0 < 63\epsilon/64$. Then $\mathrm{dist}(f, \mathbf{1}) < 63\epsilon/64$, where $\mathbf{1}$ is the all-ones
function. Then trivially, taking $g = \mathbf{1}$ completes the proof. Thus, henceforth we assume that $\mu_0 \geq 63\epsilon/64$.
For a fixed $x \in \mathbb{F}_2^n$, let $p^x_{00}$ denote $\Pr_y[(f(y), f(x + y)) = (0, 0)]$, and let $p^x_{10}$ be defined similarly. We
define $g : \mathbb{F}_2^n \to \{0, 1\}$ as follows:

$$g(x) = \begin{cases} 0, & \text{if } p^x_{00} \geq \epsilon/4; \\ 1, & \text{if } p^x_{10} \geq \epsilon/4; \\ f(x), & \text{otherwise.} \end{cases}$$
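
The self-corrected function $g$ can be computed by brute force on a small example. The sketch below assumes $f$ is given as a full value table (which a tester never sees) and reads $p^x_{10}$ as $\Pr_y[(f(y), f(x+y)) = (1, 0)]$; the function name is hypothetical. The toy parameters are chosen only to show the voting rule and do not satisfy the hypothesis $R < \epsilon^2/128$ of the proof.

```python
def self_correct(f, n, eps):
    """Build g from the value table f using the eps/4 voting thresholds."""
    size = 2 ** n
    g = []
    for x in range(size):
        count00 = sum(1 for y in range(size) if (f[y], f[x ^ y]) == (0, 0))
        count10 = sum(1 for y in range(size) if (f[y], f[x ^ y]) == (1, 0))
        vote0 = 4 * count00 >= eps * size  # p^x_00 >= eps / 4
        vote1 = 4 * count10 >= eps * size  # p^x_10 >= eps / 4
        if vote0 and vote1:
            raise ValueError("g is not well-defined at x = %d" % x)
        if vote0:
            g.append(0)
        elif vote1:
            g.append(1)
        else:
            g.append(f[x])
    return g


# x_0 OR x_1 on F_2^3 with the value at the point 1 flipped from 1 to 0.
corrupted = [0, 0, 1, 1, 0, 1, 1, 1]
corrected = self_correct(corrupted, 3, 1.0)
```

On this input the procedure outputs the table of the linear function $x_1$, which is $(1, 0, 0)$-free and differs from `corrupted` on a single point.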



Proof of (1). $g$ is well-defined.

Suppose not, then there exists some $x \in \mathbb{F}_2^n$ such that $p^x_{00}, p^x_{10} \geq \epsilon/4$. Pick $y$ and $z$ independently and
uniformly at random from $\mathbb{F}_2^n$. Let $E$ be the event that

at least one of $(f(y), f(z), f(y + z))$ and $(f(x + y), f(x + z), f(y + z))$ is $(1, 0, 0)$.

By assumption, with probability at least $\epsilon^2/16$, $f(y) = 1$, $f(x + y) = 0$ and $f(z) = f(x + z) = 0$, which
will imply that, regardless of the value of $f(y + z)$, event $E$ must occur. Thus, $\epsilon^2/16 \leq \Pr[E]$. On the
other hand, by the union bound, $\Pr[E] \leq 2R < \epsilon^2/64$, a contradiction. □



Proof of (2). $\mathrm{dist}(f, g) < \epsilon/32$.

Suppose $x$ is such that $f(x) \neq g(x)$. By construction, $\Pr_y[(f(x), f(y), f(x + y)) = (1, 0, 0)] \geq \epsilon/4$. This implies
that the rejection probability $R$ is at least $\mathrm{dist}(f, g) \cdot \epsilon/4$. Since $R < \epsilon^2/128$, $\mathrm{dist}(f, g) < \epsilon/32$. □



Before proving (3), we first note that for every $x \in \mathbb{F}_2^n$,

$$\Pr_y[(g(x), g(y), g(x + y)) = (1, 0, 0)] < \frac{5\epsilon}{16}.$$

To see this, note that by construction of $g$, for every $x \in \mathbb{F}_2^n$, $\Pr_y[(g(x), f(y), f(x + y)) = (1, 0, 0)] < \epsilon/4$.
Since $\mathrm{dist}(f, g) < \epsilon/32$, by the union bound, we can deduce that the probability that $g$ has a $(1, 0, 0)$-pattern
at $(x, y, x + y)$ is less than $\epsilon/4 + 2 \cdot \epsilon/32$.



Proof of (3). $g$ is in $\mathcal{P}_{(100)\text{-FREE}}$.

Suppose not, that is, there exist $x, y \in \mathbb{F}_2^n$ such that $g(x) = 1$ and $g(y) = g(x + y) = 0$. Pick $z$ uniformly at
random from $\mathbb{F}_2^n$. Let $E$ denote the event that

at least one of $(g(x), g(z), g(x + z))$, $(g(y), g(z), g(y + z))$,
and $(g(x + y), g(x + z), g(y + z))$ is $(1, 0, 0)$.

A case by case analysis reveals that if $g(z) = 0$, then event $E$ must occur. Note that the probability that
$g(z) = 0$ is at least $63\epsilon/64 - \epsilon/32 = 61\epsilon/64$, since $f(z) = 0$ occurs with probability at least $63\epsilon/64$ and
$\mathrm{dist}(f, g) < \epsilon/32$. On the other hand, by the union bound, we have $\Pr[g(z) = 0] \leq \Pr[E] < 3 \cdot 5\epsilon/16$, implying
that $61\epsilon/64 < 15\epsilon/16$, an absurdity.

Therefore, we have shown that on any input function that is $\epsilon$-far from $\mathcal{P}_{(100)\text{-FREE}}$, the rejection
probability of $T$ is always at least $\epsilon^2/128$. By repeating the basic test $T$ independently $O(1/\epsilon^2)$ times, we can
boost the rejection probability of $T$ to 2/3, thus completing the proof. □
4.2 A new proof that linearity is testable

As an application of our results in Section 3.2, we give a new proof that linear functions are testable based
on a set-theoretic argument. To this end, note that the set of linear functions equals the intersection of
$(1, 1, 1)$-free functions and $(1, 0, 0)$-free functions, i.e.,

$$\mathcal{P}_{\mathrm{LIN}} = \mathcal{P}_{(111)\text{-FREE}} \cap \mathcal{P}_{(100)\text{-FREE}}.$$

From the previous section, we know that $\mathcal{P}_{(100)\text{-FREE}}$ is testable. The following theorem due to Green [24]
asserts that $\mathcal{P}_{(111)\text{-FREE}}$ is also testable.
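
The identity $\mathcal{P}_{\mathrm{LIN}} = \mathcal{P}_{(111)\text{-FREE}} \cap \mathcal{P}_{(100)\text{-FREE}}$ holds because a function is linear exactly when every triple $(f(x), f(y), f(x+y))$ has an even number of ones. The brute-force sketch below checks this over all 256 Boolean functions on $\mathbb{F}_2^3$; the helper names are illustrative and not from the paper.

```python
from itertools import product


def pattern_free(f, n, pattern):
    """True iff no triple (f(x), f(y), f(x+y)) equals the pattern as a multiset."""
    target = sorted(pattern)
    size = 2 ** n
    return all(
        sorted((f[x], f[y], f[x ^ y])) != target
        for x in range(size)
        for y in range(size)
    )


def is_linear(f, n):
    """True iff f(x) + f(y) = f(x + y) over F_2 for all x and y."""
    size = 2 ** n
    return all((f[x] ^ f[y]) == f[x ^ y] for x in range(size) for y in range(size))


n = 3
all_functions = list(product((0, 1), repeat=2 ** n))
linear = [f for f in all_functions if is_linear(f, n)]
intersection = [
    f
    for f in all_functions
    if pattern_free(f, n, (1, 1, 1)) and pattern_free(f, n, (1, 0, 0))
]
```

Both lists contain the same $2^3 = 8$ functions, one for each vector $a \in \mathbb{F}_2^3$.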

Theorem 2 ([24]). The property $\mathcal{P}_{(111)\text{-FREE}}$ is testable with query complexity $W(\mathrm{poly}(1/\epsilon))$, where for
every $t > 0$, $W(t)$ denotes the tower of 2's of height $\lceil t \rceil$.

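By Proposition 2, to show that linearity is testable, it suffices to show that the two properties $\mathcal{P}_{(111)\text{-FREE}}$
and $\mathcal{P}_{(100)\text{-FREE}}$ are essentially far apart. To this end, let us define a new property $\mathcal{P}_{\mathrm{NLTF}}$, where NLTF
stands for non-linear triangle-freeness:

$$\mathcal{P}_{\mathrm{NLTF}} \stackrel{\mathrm{def}}{=} \mathcal{P}_{(111)\text{-FREE}} \setminus \mathcal{P}_{\mathrm{LIN}}.$$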

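Lemma 1. We have that $\mathcal{P}_{(100)\text{-FREE}} \setminus \mathcal{P}_{\mathrm{LIN}}$ is $\frac{1}{4}$-far from $\mathcal{P}_{\mathrm{NLTF}}$.

We first establish a weaker version of Lemma 1.

Proposition 6. Suppose $f$ is a disjunction of exactly two non-trivial linear functions. Then $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}})$
is at least $\frac{1}{4}$.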

Proof. Set A = 2". Write f(x) = (a ■ x) \J(j3 ■ x), where a ^ (3 G F<j denote two n-bit vectors not equal 
to n . We say that a tuple (x, y,x + y) where x, y G Fg is a triangle in / if f(x), f(y),f(x + y) = 1. 
We shall show that (1) / has N 2 /16 triangles and (2) for every x, the number of y's such that (x, y,x + y) 
is a triangle in / is A/4. Together, (1) and (2) will imply that dist(/, "P(ih).free) i s at least 1/4, since 
changing the value of / at one point removes at most A/4 triangles. 

To prove these two assertions, let A = {x € Fg : a ■ x = 1} and B = {x £ (3 ■ x = 1}. Since 
supp(/) = AU B, for every triangle (x, y,x + y) in /, each of the three points x,y,x + y must fall in one 
of the following three disjoint sets: 

A\B,(A(1 B), B\A. 

Furthermore, each of the three points must fall into distinct sets. To see this, suppose that x,y G A \ B. 
Then by definition, a(x + y) = a(x) + a(y) = and (3(x + y) = 0, implying that f(x + y) = 0, a 
contradiction. So A \ B cannot contain two points of a triangle, and by symmetry, neither can B \ A. The 
same calculation also reveals that An B cannot contain two points of a triangle. 

Thus, a triangle (x, y, x + y) in / must be such that x £ A \ B, y £ A n B, and x + y £ B \ A. In 
addition, it is easy to check that given two points pi,p2 from two distinct sets (say A \ B and A n B), their 
sum p\ + p2 must be in the third set (B \ A). Since these three sets A \ B, (A fl B), B \ A all have size 
N/4, this implies that the number of triangles in / is A 2 / 16, proving (1). 

(2) also follows easily given the above observations. Suppose $x \in A \setminus B$. For every $y \in A \cap B$,
$(x, y, x+y)$ forms a triangle. Since any triangle that has $x$ as a point must also contain a point in $A \cap B$
(with the third point uniquely determined by the first two), the number of triangles in $f$ containing $x$ is $N/4$.
The case when $x \in B \setminus A$ or $x \in A \cap B$ is similar. This completes the proof. □
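The two counts in the proof of Proposition 6 can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original argument: the helper names (`dot`, `add`, `triangle_stats`) are hypothetical, and each triangle is counted once as an unordered triple $\{x, y, x+y\}$, which is the convention under which the counts $N^2/16$ and $N/4$ come out as stated.

```python
from itertools import product

def dot(a, x):
    """Inner product over F_2 of two bit tuples."""
    return sum(ai & xi for ai, xi in zip(a, x)) % 2

def add(x, y):
    """Coordinate-wise sum over F_2."""
    return tuple(xi ^ yi for xi, yi in zip(x, y))

def triangle_stats(alpha, beta):
    """Return (N, number of triangles, max number of triangles through one point)
    for f(x) = (alpha . x) OR (beta . x), counting unordered triples."""
    n = len(alpha)
    points = list(product((0, 1), repeat=n))
    f = {x: dot(alpha, x) | dot(beta, x) for x in points}
    triangles = set()
    for x in points:
        for y in points:
            z = add(x, y)
            if f[x] and f[y] and f[z]:
                triangles.add(frozenset((x, y, z)))
    per_point = {x: sum(1 for t in triangles if x in t) for x in points}
    return len(points), len(triangles), max(per_point.values())

# Example with n = 4, alpha = e_1, beta = e_2 (an arbitrary illustrative choice).
N, total, worst = triangle_stats((1, 0, 0, 0), (0, 1, 0, 0))
print(N, total, worst)
```

For distinct non-zero `alpha` and `beta`, the second value should equal `N * N // 16` and the third `N // 4`, matching assertions (1) and (2).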
Now we prove Lemma 1.

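Proof of Lemma 1. Let $f \in \mathcal{P}_{(100)\text{-FREE}} \setminus \mathcal{P}_{\mathrm{LIN}}$ and write $f = f_1 \vee f_2$, where $f_1$ is a disjunction of
exactly two linear functions. By Proposition 6, it follows that $\mathrm{dist}(f_1, \mathcal{P}_{(111)\text{-FREE}})$ is at least $1/4$. Since
$\mathcal{P}_{(111)\text{-FREE}}$ is monotone and $\mathrm{supp}(f_1) \subseteq \mathrm{supp}(f)$, by Observation 1 we know that $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) \geq
1/4$. Since $\mathcal{P}_{\mathrm{NLTF}} \subseteq \mathcal{P}_{(111)\text{-FREE}}$, $\mathrm{dist}(f, \mathcal{P}_{\mathrm{NLTF}}) \geq \mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}})$, completing the proof. □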

By Theorem 2 and Theorem 1, both $\mathcal{P}_{(111)\text{-FREE}}$ and $\mathcal{P}_{(100)\text{-FREE}}$ are testable. Now by combining
Proposition 2 and Lemma 1 we obtain the following:

Theorem 3. $\mathcal{P}_{\mathrm{LIN}}$ is testable.

We remark that the query complexity for testing linearity in Theorem 3 is of the tower type (of the
form $W(\mathrm{poly}(1/\epsilon))$) because of Theorem 2. This is much worse than the optimal linear query upper bound
obtained in HI HOI. 

4.3 A lower bound for testing non-linear triangle-freeness 

We first show that $\mathcal{P}_{\mathrm{LIN}}$ is a "thin strip" around $\mathcal{P}_{\mathrm{NLTF}}$.

Proposition 7. For any Boolean function $f$, $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) \geq \mathrm{dist}(f, \mathcal{P}_{\mathrm{NLTF}}) - 2^{-n}$.

Proof. The statement is trivially true if $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) = \mathrm{dist}(f, \mathcal{P}_{\mathrm{NLTF}})$. Since $\mathcal{P}_{\mathrm{NLTF}}$ is a proper
subset of $\mathcal{P}_{(111)\text{-FREE}}$, we can assume that $\mathrm{dist}(f, \mathcal{P}_{\mathrm{NLTF}})$ is strictly larger than $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}})$,
implying that the function in $\mathcal{P}_{(111)\text{-FREE}}$ that has minimum distance to $f$ is actually in $\mathcal{P}_{\mathrm{LIN}}$. Call this
function $g$. Then it is easy to see that there exists some function $h$ in $\mathcal{P}_{\mathrm{NLTF}}$ such that $\mathrm{dist}(g, h) = 2^{-n}$. To
this end, note that if $g$ is the all-zero function, we can define $h$ such that $h(x) = 1$ for some $x \neq 0^n$ and $0$
everywhere else. By construction $h$ is non-linear but triangle-free. If $g$ is a non-trivial linear function, then
we can pick any $x \in \mathrm{supp}(g)$ and define $h(x) = 0$ and $h(y) = g(y)$ for all $y \neq x$. By construction $h$ is
non-linear, and since $\mathcal{P}_{(111)\text{-FREE}}$ is monotone, $h$ remains triangle-free.

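Thus, by the triangle inequality, we know that $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) = \mathrm{dist}(f, g)$ is at least $\mathrm{dist}(f, h) - 2^{-n}$.
This implies that $\mathrm{dist}(f, \mathcal{P}_{(111)\text{-FREE}}) \geq \mathrm{dist}(f, \mathcal{P}_{\mathrm{NLTF}}) - 2^{-n}$. □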
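The one-point perturbation used in the proof of Proposition 7 is easy to test exhaustively on a small cube. The sketch below is illustrative only: the function names are hypothetical, functions are stored as dictionaries over $\{0,1\}^n$, and it assumes $n \geq 2$ (for $n = 1$ a single-point change can yield a linear function again).

```python
from itertools import product

def xor(x, y):
    """Coordinate-wise sum over F_2."""
    return tuple(a ^ b for a, b in zip(x, y))

def is_linear(f, points):
    """f is linear iff f(x + y) = f(x) + f(y) for all x, y."""
    return all(f[xor(x, y)] == f[x] ^ f[y] for x in points for y in points)

def is_triangle_free(f, points):
    """No x, y with f(x) = f(y) = f(x + y) = 1."""
    return not any(f[x] and f[y] and f[xor(x, y)] for x in points for y in points)

def perturb(g, points):
    """Change a linear g at exactly one point, following the two cases of the proof."""
    h = dict(g)
    if not any(g.values()):
        x = next(p for p in points if any(p))  # all-zero g: set one non-zero point to 1
        h[x] = 1
    else:
        x = next(p for p in points if g[p])    # non-trivial g: delete one support point
        h[x] = 0
    return h

n = 3
points = list(product((0, 1), repeat=n))
g = {x: x[0] ^ x[2] for x in points}           # an arbitrary non-trivial linear function
h = perturb(g, points)
print(is_linear(h, points), is_triangle_free(h, points))
```

Since `h` differs from `g` at a single point out of $2^n$, the distance between them is $2^{-n}$, as the proof requires.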

Since any linear function is $2^{-n}$-close to a function in $\mathcal{P}_{\mathrm{NLTF}}$, intuitively we expect $\mathcal{P}_{\mathrm{NLTF}}$, which is
obtained by deleting the strip $\mathcal{P}_{\mathrm{LIN}}$ from $\mathcal{P}_{(111)\text{-FREE}}$, to inherit the testability features of $\mathcal{P}_{(111)\text{-FREE}}$.
Indeed, we record this next by using the set-theoretic machinery set up in Section 3.

Theorem 4. $\mathcal{P}_{\mathrm{NLTF}}$ is testable, but any non-adaptive tester (with one-sided error) for $\mathcal{P}_{\mathrm{NLTF}}$ requires
$\omega(1/\epsilon)$ queries.

Proof. We first observe that $\mathcal{P}_{\mathrm{NLTF}}$ is testable with one-sided error. By Proposition 7 and Observation 2,
the testing algorithm for $\mathcal{P}_{\mathrm{NLTF}}$ is simply the same as the tester for $\mathcal{P}_{(111)\text{-FREE}}$ HU.

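Next we show that the lower bound for the query complexity of $\mathcal{P}_{\mathrm{NLTF}}$ is the same as that of $\mathcal{P}_{(111)\text{-FREE}}$.
As shown in |fT71, any one-sided, non-adaptive tester for $\mathcal{P}_{(111)\text{-FREE}}$ requires $\omega(1/\epsilon)$ queries.$^7$ Suppose
$\mathcal{P}_{\mathrm{NLTF}}$ is testable with one-sided error and has query complexity $O(1/\epsilon)$. Since $\mathcal{P}_{\mathrm{LIN}}$ is testable with query
complexity $O(1/\epsilon)$ lfl~8l, by Proposition 1, $\mathcal{P}_{(111)\text{-FREE}} = \mathcal{P}_{\mathrm{LIN}} \cup \mathcal{P}_{\mathrm{NLTF}}$ is testable with one-sided error
and has query complexity $O(1/\epsilon)$, a contradiction. □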

6 A tester is non-adaptive if all its query points can be determined before the execution of the algorithm, i.e., the locations where 
a tester queries do not depend on the answers to previous queries. 

7 The specific lower bound shown in fTTl is $\Omega((1/\epsilon)^{1.704\ldots})$ but can be improved to $\Omega((1/\epsilon)^{2.423\ldots})$, as observed independently
by Eli Ben-Sasson and the third author of the present paper.
5 Concluding remarks



We have initiated a general study of the closure of testability under various set operations. Our results show 
that such a study can lead to both upper and lower bound results in property testing. We believe our answers 
are far from complete, and further investigation may lead to more interesting results. For example, the
symmetric difference between two properties $\mathcal{P}_1$ and $\mathcal{P}_2$ is defined to be $\mathcal{P}_1 \,\triangle\, \mathcal{P}_2 = (\mathcal{P}_1 \setminus \mathcal{P}_2) \cup (\mathcal{P}_2 \setminus \mathcal{P}_1)$.
Under what conditions is the property $\mathcal{P}_1 \,\triangle\, \mathcal{P}_2$ testable if both $\mathcal{P}_1$ and $\mathcal{P}_2$ are testable? Another natural
generalization of our approach is to examine properties resulting from finitely many applications of some
set-theoretic operations.

Our proof that the class of disjunction of linear functions is testable employs a BLR-style self-correction 
approach. We believe that this technique may be useful in analyzing other non-monotone, pattern-free 
properties. In particular, it will be interesting to carry out our approach of decomposing an algebraic property 
into atomic ones for higher degree polynomials. This will, in addition to giving a set-theoretic proof for
testing low-degree polynomials, shed light on how pattern-free properties relate to one another.

Finally, our quadratic query complexity upper bound for the disjunction of linear functions opens up a 
number of directions. In our work, the blowup in query complexity from the disjunction is $O(1/\epsilon)$. One may
vary the underlying properties and the operators to measure the blowup in query complexity. Of particular 
interest may be understanding how the disjunction affects the testability of low-degree polynomials. 
A Proof of Proposition 1

Let $T_1$ be the tester for $\mathcal{P}_1$ with query complexity $q_1(\epsilon, n)$ and $T_2$ be the tester for $\mathcal{P}_2$ with query complexity
$q_2(\epsilon, n)$. We may assume that both $T_1$ and $T_2$ have soundness $1/6$ with a constant blowup in their query
complexity. Define $T$ to be the tester which, on input function $f$, first simulates $T_1$ and then $T_2$. If at least
one of the two testers $T_1$ and $T_2$ accepts $f$, $T$ accepts $f$. Otherwise, $T$ rejects.

Clearly the query complexity of $T$ is $O(q_1 + q_2)$. For completeness, note that if $f$ is in $\mathcal{P}$, then by
definition $f$ is in at least one of $\mathcal{P}_1$ and $\mathcal{P}_2$. Thus, $T$ accepts $f$ with probability 1. Now suppose $\mathrm{dist}(f, \mathcal{P}) >
\epsilon$. Then we have both $\mathrm{dist}(f, \mathcal{P}_1) > \epsilon$ and $\mathrm{dist}(f, \mathcal{P}_2) > \epsilon$. By the union bound, the probability that at least
one of $T_1$ and $T_2$ accepts $f$ is at most $1/6 + 1/6 = 1/3$. □
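The union tester described above is a simple combinator over two sub-testers. The following sketch illustrates it on toy properties; the sub-testers ("the function is identically 0" and "the function is identically 1"), the helper names, and the sample count $\lceil \ln 6 / \epsilon \rceil$ are all assumptions made for the example, chosen so that each sub-tester has one-sided error and soundness $1/6$.

```python
import math
import random

def constant_tester(bit, rng):
    """Toy one-sided tester for the property 'f is identically `bit`'."""
    def tester(f, n, eps):
        # k >= ln(6)/eps samples: an eps-far input passes with probability
        # at most (1 - eps)^k <= 1/6.
        k = math.ceil(math.log(6) / eps)
        return all(f(rng.randrange(n)) == bit for _ in range(k))
    return tester

def union_tester(t1, t2):
    """Accept iff at least one of the two sub-testers accepts."""
    def tester(f, n, eps):
        a1 = t1(f, n, eps)  # simulate T1
        a2 = t2(f, n, eps)  # then simulate T2
        return a1 or a2
    return tester

rng = random.Random(0)
T = union_tester(constant_tester(0, rng), constant_tester(1, rng))
print(T(lambda i: 0, 1000, 0.25))      # member of the union: always accepted
print(T(lambda i: i % 2, 1000, 0.25))  # far from both: accepted with probability <= 1/3
```

The query count of the combined tester is the sum of the two sub-testers' counts, in line with the $O(q_1 + q_2)$ bound.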
B Proof of Proposition 2



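Let $T_1$ be the tester for $\mathcal{P}_1$ with query complexity $q_1(\epsilon)$, and $T_2$ be the tester for $\mathcal{P}_2$ with query complexity
$q_2(\epsilon)$. First we convert $T_1$ into another tester $T_1'$ for $\mathcal{P}_1$ such that, on input distance parameter $\epsilon$, $T_1'$ makes
$Q_1'(\epsilon)$ queries, where

$$Q_1'(\epsilon) = \begin{cases} q_1(\epsilon) & \text{if } \epsilon < \epsilon_0/2, \\ \max\{q_1(\epsilon),\, q_1(\epsilon_0/2)\} & \text{otherwise.} \end{cases}$$

In other words, $T_1'$ can be obtained from $T_1$ by making more queries when $\epsilon$ is larger than $\epsilon_0/2$. Similarly,
we can construct $T_2'$ from $T_2$ in the same manner. Since $\epsilon_0$ is a constant, we have $Q_1'(\epsilon) = O(q_1(\epsilon))$ and
$Q_2'(\epsilon) = O(q_2(\epsilon))$.

Define $T$ to be the tester that, on input function $f$, first simulates $T_1'$ and then $T_2'$. If both testers $T_1'$ and
$T_2'$ accept, then $T$ accepts $f$. Otherwise, it rejects. The query complexity of $T$ is $Q_1'(\epsilon) + Q_2'(\epsilon)$, which is
$O(q_1(\epsilon) + q_2(\epsilon))$.

For completeness, if $f \in \mathcal{P}$, then both $f \in \mathcal{P}_1$ and $f \in \mathcal{P}_2$ hold. Therefore, $T$ accepts with
probability 1. For soundness, suppose $\mathrm{dist}(f, \mathcal{P}) > \epsilon$. We distinguish between two cases.

Case 1. $\epsilon < \epsilon_0/2$.

It suffices to show that $f$ is $\epsilon$-far from at least one of $\mathcal{P}_1$ or $\mathcal{P}_2$. This fact then implies that $T$, in simulating
$T_1'$ and $T_2'$, accepts $f$ with probability at most $1/3$.

To show that $f$ is far from at least one of the two properties, suppose not; that is, we have both $\mathrm{dist}(f, \mathcal{P}_1) \leq
\epsilon$ and $\mathrm{dist}(f, \mathcal{P}_2) \leq \epsilon$. Then there exist $g_1 \in \mathcal{P}_1$ and $g_2 \in \mathcal{P}_2$ such that $\mathrm{dist}(f, g_1) \leq \epsilon$ and $\mathrm{dist}(f, g_2) \leq \epsilon$.
Since $\mathrm{dist}(f, \mathcal{P}) > \epsilon$, $g_1, g_2 \notin \mathcal{P}$ and therefore $g_1 \in \mathcal{P}_1 \setminus \mathcal{P}$ and $g_2 \in \mathcal{P}_2 \setminus \mathcal{P}$. By the triangle inequality,
$\mathrm{dist}(g_1, g_2) \leq 2\epsilon < \epsilon_0$, and consequently $\mathrm{dist}(\mathcal{P}_1 \setminus \mathcal{P}_2, \mathcal{P}_2 \setminus \mathcal{P}_1) < \epsilon_0$, contradicting our assumption.

Case 2. $\epsilon \geq \epsilon_0/2$.

There are three sub-cases depending on where $f$ is located. We analyze each of them separately below.
Note that in each of the sub-cases, $f$ is at least $\epsilon_0/2$-far from one of $\mathcal{P}_1$ and $\mathcal{P}_2$.

1. $f \in \mathcal{P}_1 \setminus \mathcal{P}$. Then by our assumption on the distance between $\mathcal{P}_1 \setminus \mathcal{P}_2$ and $\mathcal{P}_2 \setminus \mathcal{P}_1$, $\mathrm{dist}(f, \mathcal{P}_2 \setminus \mathcal{P}) \geq \epsilon_0$.
It follows that

$$\mathrm{dist}(f, \mathcal{P}_2) = \min\{\mathrm{dist}(f, \mathcal{P}),\, \mathrm{dist}(f, \mathcal{P}_2 \setminus \mathcal{P})\} \geq \min\{\epsilon, \epsilon_0\} \geq \epsilon_0/2.$$

2. $f \in \mathcal{P}_2 \setminus \mathcal{P}$. Analogous to the case above, we have $\mathrm{dist}(f, \mathcal{P}_1) \geq \epsilon_0/2$.

3. $f \notin \mathcal{P}_1 \cup \mathcal{P}_2$. Then by the triangle inequality, $\max\{\mathrm{dist}(f, \mathcal{P}_1 \setminus \mathcal{P}), \mathrm{dist}(f, \mathcal{P}_2 \setminus \mathcal{P})\} \geq \epsilon_0/2$. So
there is some $i \in \{1, 2\}$ such that $\mathrm{dist}(f, \mathcal{P}_i \setminus \mathcal{P}) \geq \epsilon_0/2$. Since $\mathrm{dist}(f, \mathcal{P}) > \epsilon$, it follows that
$\mathrm{dist}(f, \mathcal{P}_i) \geq \min\{\epsilon, \epsilon_0/2\} = \epsilon_0/2$.

Thus, we conclude that there is some $i \in \{1, 2\}$ such that $\mathrm{dist}(f, \mathcal{P}_i) \geq \epsilon_0/2$. This implies that $T_i'$, which
makes at least $Q_i'(\epsilon) \geq q_i(\epsilon_0/2)$ queries, accepts $f$ with probability at most $1/3$. Hence, $T$ accepts $f$ with
probability at most $1/3$ as well, completing the proof. □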
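The boosted query budget and the accept rule of the intersection tester can be sketched as follows. This is only an illustration under stated assumptions: `boosted_query_count` encodes the rule "use at least as many queries as at distance $\epsilon_0/2$ once $\epsilon \geq \epsilon_0/2$", the query-complexity function `q` is a hypothetical example, and the sub-testers passed to `intersection_tester` are assumed to be already boosted.

```python
import math
from fractions import Fraction

def boosted_query_count(q, eps, eps0):
    """Query budget of the boosted tester built from a tester with budget q."""
    if eps < eps0 / 2:
        return q(eps)
    return max(q(eps), q(eps0 / 2))

def intersection_tester(t1_boosted, t2_boosted):
    """Accept iff both boosted sub-testers accept."""
    def tester(f, eps):
        a1 = t1_boosted(f, eps)  # simulate the first boosted tester
        a2 = t2_boosted(f, eps)  # then the second
        return a1 and a2
    return tester

# Hypothetical query complexity q(eps) = ceil(1/eps); exact rationals avoid rounding issues.
q = lambda eps: math.ceil(1 / eps)
eps0 = Fraction(1, 5)
print(boosted_query_count(q, Fraction(1, 2), eps0))   # large eps: budget is raised to q(eps0/2)
print(boosted_query_count(q, Fraction(1, 20), eps0))  # small eps: budget is unchanged
```

Because $\epsilon_0$ is a constant, the boosted budget exceeds the original one by at most the constant $q(\epsilon_0/2)$, which is why the asymptotic query complexity is unaffected.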
C Proof of Proposition 3

We shall define a property $\mathcal{P} = \{\mathcal{P}_{2k}\}_{k>0}$, where $\mathcal{P}_{2k} \subseteq \{0,1\}^{\mathbb{F}_2^{2k}}$ is a collection of Boolean functions
defined over $\mathbb{F}_2^{2k}$, such that neither $\mathcal{P}_{2k}$ nor $\overline{\mathcal{P}}_{2k}$ is testable. Recall that a property $\mathcal{P}$ is said to be testable if
there is a tester for $\mathcal{P}$ whose query complexity is independent of the sizes of the inputs to the functions (in
our case, independent of $k$).

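First, let the Hadamard encoding $\mathrm{Had} : \mathbb{F}_2^k \times \mathbb{F}_2^k \to \{0,1\}$ be $\mathrm{Had}(\alpha, x) = \alpha \cdot x$. Note that $\mathbb{F}_{2^k}$ is
isomorphic to $\mathbb{F}_2^k$, so for every function $g : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$, the Hadamard concatenation of $g$ can be written as

$$\mathrm{Had} \circ g : \mathbb{F}_2^{2k} \to \{0,1\}, \quad \text{where } (\mathrm{Had} \circ g)(x, y) \stackrel{\mathrm{def}}{=} \mathrm{Had}(g(x), y).$$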

We now define $\mathcal{P}_{2k}$ as follows. Let $f \in \mathcal{P}_{2k}$ if there exists a polynomial $p : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$ of degree
at most $2^{k-1} - 1$ such that $\mathrm{dist}(f, \mathrm{Had} \circ p) < 1/8$. An important fact is that if $g : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$ is a
polynomial of degree $2^{k-1}$, then $\mathrm{Had} \circ g$ is not in $\mathcal{P}_{2k}$. To see this, note that by the Schwartz-Zippel Lemma,
if $q : \mathbb{F}_{2^k} \to \mathbb{F}_{2^k}$ is a non-zero polynomial of degree at most $2^{k-1}$, then $\Pr_x[q(x) = 0] \leq 1/2$. Therefore, for any
polynomial $p$ of degree at most $2^{k-1} - 1$, $\mathrm{dist}(p, g) \geq 1/2$. This implies that $\mathrm{dist}(\mathrm{Had} \circ p, \mathrm{Had} \circ g) \geq 1/4$,
since the Hadamard encoding has relative distance $1/2$.$^8$ Since the Hadamard encoding of $g$ is at least $1/4$-
far from the Hadamard encoding of any polynomial of degree at most $2^{k-1} - 1$, by construction of $\mathcal{P}_{2k}$, $\mathrm{Had} \circ g$ is at
least $1/8$-far from $\mathcal{P}$, i.e., $\mathrm{Had} \circ g \notin \mathcal{P}_{2k}$.
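The relative-distance fact used above (and spelled out in footnote 8) says that two distinct vectors $a \neq b$ in $\mathbb{F}_2^k$ have Hadamard encodings that disagree on exactly $2^{k-1}$ inputs $y$. The brute-force check below is a small illustrative sketch with hypothetical helper names.

```python
from itertools import product

def had(a, y):
    """Hadamard encoding bit: inner product of a and y over F_2."""
    return sum(ai & yi for ai, yi in zip(a, y)) % 2

def disagreements(a, b):
    """Number of y in F_2^k on which Had(a, y) and Had(b, y) differ."""
    k = len(a)
    return sum(had(a, y) != had(b, y) for y in product((0, 1), repeat=k))

k = 3
vectors = list(product((0, 1), repeat=k))
counts = {disagreements(a, b) for a in vectors for b in vectors if a != b}
print(counts)
```

For every pair of distinct vectors the count is $2^{k-1}$, so a $1/2$ fraction of disagreements between $p$ and $g$ translates into a $1/4$ fraction between their Hadamard concatenations.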

Now we show that neither $\mathcal{P}_{2k}$ nor its complement is testable for any distance parameter $\epsilon < 1/8$. By
polynomial interpolation, for every set of $2^{k-1} - 1$ points, there exists a polynomial of degree at most $2^{k-1} - 1$ that
agrees with $g$ on these points. So any tester that distinguishes between members of $\mathcal{P}_{2k}$ and members at least
$\epsilon$-far away from $\mathcal{P}_{2k}$ needs at least $2^{k-1} - 1$ queries. Similarly, as we have just shown that $\mathrm{Had} \circ g \in \overline{\mathcal{P}}_{2k}$
when $g$ is a degree-$2^{k-1}$ polynomial, it follows that any tester that distinguishes between members of $\overline{\mathcal{P}}_{2k}$
and functions at least $\epsilon$-far away from $\overline{\mathcal{P}}_{2k}$ also needs at least $2^{k-1}$ queries. To conclude, we have shown a
property $\mathcal{P}$ defined over domains of sizes $|\mathcal{D}| = 2^{2k}$ such that testing $\mathcal{P}$ and $\overline{\mathcal{P}}$ both require $\Omega(2^k) = \Omega(|\mathcal{D}|^{1/2})$
queries. Thus, neither the property nor its complement is testable with a query complexity independent of the
sizes of the domains, completing the proof. □

D Proof of Proposition 3] 

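Let $T_1$ be the tester for $\mathcal{P}_1$ with query complexity $q_1(\epsilon)$ and let $T_2$ be the tolerant tester for $\mathcal{P}_2$ with query
complexity $q_2(\epsilon_1, \epsilon_2)$. First we convert $T_1$ into another tester $T_1'$ such that, on input distance parameter $\epsilon$, $T_1'$
makes $Q_1'(\epsilon)$ queries, where

$$Q_1'(\epsilon) = \max\{q_1(\epsilon),\, q_1(\epsilon_1)\}.$$

Set $\mathcal{P} = \mathcal{P}_1 \setminus \mathcal{P}_2$ and define its tester $T$ as follows: on input function $f$, $T$ first simulates $T_1'$ and then $T_2$.
$T$ accepts iff $T_1'$ accepts and $T_2$ rejects. Since $\epsilon_1$ is a constant, $Q_1'(\epsilon) = O(q_1(\epsilon))$, and $T$ has query complexity
$O(q_1 + q_2)$.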

For completeness, if $f \in \mathcal{P}$, then by assumption $f \in \mathcal{P}_1$ and $\mathrm{dist}(f, \mathcal{P}_2) \geq \epsilon_0 > \epsilon_2$. This implies that $T_1'$
always accepts $f$, $T_2$ accepts $f$ with probability at most $1/3$, and thus by a union bound argument $T$ accepts
$f$ with probability at least $2/3$.

8 In other words, suppose $x \in \mathbb{F}_{2^k}$ satisfies $p(x) \neq g(x)$. Then the number of $y$'s such that $\mathrm{Had}(p(x), y) \neq \mathrm{Had}(g(x), y)$
is exactly $2^{k-1}$.
For soundness, suppose $\mathrm{dist}(f, \mathcal{P}) > \epsilon$. We consider two cases and note that in both of them, $T$ accepts
$f$ with probability at most $1/3$.

Case 1. $\mathrm{dist}(f, \mathcal{P}_2) \leq \epsilon_1$.

Since $T_2$ is a tolerant tester, $T_2$ rejects $f$ with probability at most $1/3$. Thus, $T$ accepts with probability
at most $1/3$ as well.

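Case 2. $\mathrm{dist}(f, \mathcal{P}_2) > \epsilon_1$.

Since $\mathcal{P}_1$ is the union of $\mathcal{P}$ and $\mathcal{P}_2$, we can conclude that $\mathrm{dist}(f, \mathcal{P}_1) = \min\{\mathrm{dist}(f, \mathcal{P}), \mathrm{dist}(f, \mathcal{P}_2)\}$,
which is at least $\min\{\epsilon, \epsilon_1\}$. Since $T_1'$ makes at least $\max\{q_1(\epsilon), q_1(\epsilon_1)\}$ queries, we know that $T_1'$ accepts
$f$ with probability at most $1/3$, and hence, $T$ accepts $f$ with probability at most $1/3$ as well. □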
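The accept rule of the set-difference tester combines a standard tester with a tolerant one. A minimal sketch follows, assuming the first sub-tester has already been boosted and the tolerant tester has its two distance thresholds fixed in advance; all names are hypothetical, and the stub testers in the usage lines stand in for real randomized testers.

```python
def difference_tester(t1_boosted, t2_tolerant):
    """Accept iff the tester for the first property accepts
    and the tolerant tester for the second property rejects."""
    def tester(f, eps):
        in_p1 = t1_boosted(f, eps)   # simulate the boosted tester for the first property
        near_p2 = t2_tolerant(f)     # then the tolerant tester (thresholds fixed beforehand)
        return in_p1 and not near_p2
    return tester

# Deterministic stubs illustrating the four possible combinations of sub-tester answers.
accepts = lambda f, eps: True
rejects = lambda f, eps: False
near = lambda f: True
far = lambda f: False

print(difference_tester(accepts, far)(None, 0.1))   # in P1 and far from P2: accept
print(difference_tester(accepts, near)(None, 0.1))  # close to P2: reject
```

The two cases of the soundness analysis correspond to the two ways this rule can reject: either the tolerant tester accepts (Case 1) or the boosted tester rejects (Case 2).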